A constructive proof of the Lovász Local Lemma



Robin A. Moser* 

Institute for Theoretical Computer Science 
Department of Computer Science 
ETH Zürich, 8092 Zürich, Switzerland
robin.moser@inf.ethz.ch

October 2008 



Abstract 

The Lovász Local Lemma [EL75] is a powerful tool to prove the existence of combinatorial objects
meeting a prescribed collection of criteria. The technique can directly be applied to the satisfiability
problem, yielding that a $k$-CNF formula in which each clause has common variables with at most $2^{k-2}$
other clauses is always satisfiable. All hitherto known proofs of the Local Lemma are non-constructive
and thus do not provide a recipe as to how a satisfying assignment to such a formula can be efficiently
found. In his breakthrough paper [Bec91], Beck demonstrated that if the neighbourhood of each clause
is restricted to $O(2^{k/48})$, a polynomial-time algorithm for the search problem exists. Alon simplified and
randomized his procedure and improved the bound to $O(2^{k/8})$ [Alo91]. Srinivasan presented in [Sri08]
a variant that achieves a bound of essentially $O(2^{k/4})$. In [Mos08], we improved this to $O(2^{k/2})$. In
the present paper, we give a randomized algorithm that finds a satisfying assignment to every $k$-CNF
formula in which each clause has a neighbourhood of at most the asymptotic optimum of $2^{k-5} - 1$ other
clauses and that runs in expected time polynomial in the size of the formula, irrespective of $k$. If $k$ is
considered a constant, we can also give a deterministic variant. In contrast to all previous approaches,
our analysis no longer invokes the standard non-constructive versions of the Local Lemma and
can therefore be considered an alternative, constructive proof of it.

Key Words and Phrases. Lovász Local Lemma, derandomization, bounded occurrence SAT instances,
hypergraph colouring. 

1 Introduction 

We use the notational framework introduced in [Wel08]. We assume an infinite supply of propositional
variables. A literal $L$ is a variable $x$ or a complemented variable $\bar{x}$. A finite set $D$ of literals over pairwise
distinct variables is called a clause. We say that a variable $x$ occurs in $D$ if $x \in D$ or $\bar{x} \in D$. A finite set $F$
of clauses is called a formula in CNF (Conjunctive Normal Form). We say that $F$ is a $k$-CNF formula if
every clause has size exactly $k$. We write $\mathrm{vbl}(F)$ to denote the set of all variables occurring in $F$. Likewise,
$\mathrm{vbl}(D)$ is the set of variables occurring in a clause $D$. For a literal $L$, let $\mathrm{vbl}(L) \in \mathrm{vbl}(F)$ refer directly to
the underlying variable.

A truth assignment is a function $\alpha : \mathrm{vbl}(F) \to \{0, 1\}$ which assigns a boolean value to each variable.
A literal $L = x$ (or $L = \bar{x}$) is satisfied by $\alpha$ if $\alpha(x) = 1$ (or $\alpha(x) = 0$). A clause is satisfied by $\alpha$ if it contains
a satisfied literal and a formula is satisfied if all of its clauses are. A formula is satisfiable if there exists a
satisfying truth assignment to its variables.

Let $k \in \mathbb{N}$ and let $F$ be a $k$-CNF formula. The dependency graph of $F$ is defined as $G[F] = (V, E)$
with $V = F$ and $E = \{\{C, D\} \subseteq V \mid C \ne D,\ \mathrm{vbl}(C) \cap \mathrm{vbl}(D) \ne \emptyset\}$. The neighbourhood of a given clause
$C$ in $F$ is defined as the set $\Gamma_F(C) := \{D \mid D \ne C,\ \mathrm{vbl}(D) \cap \mathrm{vbl}(C) \ne \emptyset\}$ of all clauses sharing common
variables with $C$. It coincides with the set of vertices adjacent to $C$ in $G[F]$. The inclusive neighbourhood
of $C$ is defined to be $\Gamma^+_F(C) := \Gamma_F(C) \cup \{C\}$.

Suppose $U \subseteq \mathrm{vbl}(F)$ is an arbitrary subset of the variables occurring in $F$, then we will denote by
$F_{(U)} := \{C \in F \mid \mathrm{vbl}(C) \cap U \ne \emptyset\}$ the subformula that is affected by these variables.

If $\alpha$ is any assignment for $F$, we write $\mathrm{vlt}(F, \alpha)$ to denote the set of clauses which are violated by $\alpha$.
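
The definitions of this section can be sketched in a few lines of Python. The encoding is an assumption made for illustration and is not taken from the paper: a literal is a nonzero integer whose sign marks complementation, a clause is a frozenset of literals, and the function names are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set

Clause = FrozenSet[int]  # assumed encoding: +v is the variable v, -v its complement


def vbl(clause: Clause) -> Set[int]:
    """Variables occurring in a clause."""
    return {abs(lit) for lit in clause}


def neighbourhood(formula: List[Clause], c: Clause) -> List[Clause]:
    """Gamma_F(C): all clauses other than C that share a variable with C."""
    return [d for d in formula if d != c and vbl(d) & vbl(c)]


def inclusive_neighbourhood(formula: List[Clause], c: Clause) -> List[Clause]:
    """Gamma^+_F(C): the neighbourhood of C together with C itself."""
    return neighbourhood(formula, c) + [c]


def violated(clauses: List[Clause], alpha: Dict[int, int]) -> List[Clause]:
    """vlt(F, alpha): clauses in which no literal is satisfied by alpha."""
    def satisfied(lit: int) -> bool:
        return alpha[abs(lit)] == (1 if lit > 0 else 0)
    return [c for c in clauses if not any(satisfied(lit) for lit in c)]
```

For instance, with the clauses {1, 2, 3}, {-3, 4, 5} and {6, 7, 8}, the first two are neighbours because they share variable 3, while the third has an empty neighbourhood.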

The Lovász Local Lemma was introduced in [EL75] as a tool to prove the existence of combinatorial
objects meeting a prescribed collection of criteria. A simple, symmetric and uniform version of the Local 
Lemma can be directly formulated in terms of satisfiability. It then reads as follows. 

Theorem 1.1. [EL75] If $F$ is a $k$-CNF formula such that all of its clauses $C \in F$ satisfy the property
$|\Gamma_F(C)| \le 2^{k-2}$, then $F$ is satisfiable.

The hitherto known proofs of this statement are non-constructive, meaning that they do not disclose 
an efficient (polynomial-time) method to find a satisfying assignment. Whether there exists any such 
method was a long-standing open problem until Beck presented in his breakthrough paper [Bec91] an
algorithm that finds a satisfying assignment in polynomial time, at least if $|\Gamma_F(C)| \le 2^{k/48}$. Using various
guises of the tools introduced by Beck, there have been several attempts to improve upon the exponent
[Alo91, Mos06, Sri08, Mos08], but a significant gap always remained. In the present paper we will close
that gap to the asymptotic optimum. While the tools applied in the analysis are still derived from the 
original approach by Beck and also significantly from the randomization and simplification contributed 
by Alon, the algorithm itself now looks substantially different. An interesting new aspect of the present 
analysis is that the standard non-constructive proofs of the Local Lemma are not invoked anymore. The 
algorithm we present and its proof of correctness can therefore be considered a constructive proof of the
Local Lemma, or at least of its incarnation for satisfiability. We conjecture that the methods proposed can
be seamlessly translated to most applications covered by the framework by Molloy and Reed in [MR98];
however, this remains to be formally checked. Our main result will be the following.

Theorem 1.2. If $F$ is a $k$-CNF formula such that all clauses $C \in F$ satisfy the property $|\Gamma^+_F(C)| \le 2^{k-5}$,
then $F$ is satisfiable and there exists a randomized algorithm that finds a satisfying assignment to $F$ in
expected time polynomial in $|F|$ (independent of $k$).

If we drop the requirement that the algorithm be of polynomial running time for asymptotically growing 
k, then we can also derandomize the procedure. 

Theorem 1.3. Let $k$ be a fixed constant. If $F$ is a $k$-CNF formula such that all clauses $C \in F$ satisfy
the property $|\Gamma^+_F(C)| \le 2^{k-5}$, then $F$ is satisfiable and there exists a deterministic algorithm that finds a
satisfying assignment to $F$ in time polynomial in $|F|$.

In the sequel we shall prove the two claims. 

2 A randomized algorithm based on local corrections 

The algorithm is as simple and natural as it can get. Basically, we start with a random assignment, then we
check whether any clauses are violated and if so, we pick one of them and sample another random assignment
for the variables in that clause. We continue doing this until we either find a satisfying assignment or, if the
correction procedure takes too much time, we give up and restart with another random assignment. This
very basic method turns out to be sufficiently strong for the case of formulas with small neighbourhoods.
The only thing we need to do so as to make a running time analysis possible is to select the clauses to be
corrected in a somewhat systematic fashion.

As in the theorem, let $F$ be a $k$-CNF formula over $n$ variables with $m$ clauses, such that $\forall C \in F$ :
$|\Gamma^+_F(C)| \le d$, with $d := 2^{k-5}$. We impose an arbitrary, globally fixed ordering upon the clauses of $F$; let us
call this the lexicographic ordering. We define a recursive procedure which takes the formula $F$, a starting
assignment $\alpha$ and a clause $C \in F$ which is violated by $\alpha$ as input, and outputs another assignment that
arises from $\alpha$ by performing a series of local corrections in the proximity of $C$.



function locally_correct(F, α, C)
    α ← α with the assignments for vbl(C) replaced by random values (u.a.r.);
    while vlt(Γ^+_F(C), α) ≠ ∅ do
        D ← lexicographically first clause in vlt(Γ^+_F(C), α);
        α ← locally_correct(F, α, D);
    return α;



Algorithm 2.1: recursive procedure for local corrections 



As you immediately notice, this recursion has the potential of running forever. We will however see that 
long running times are unlikely to occur. If we are unlucky enough to encounter such a case, we interrupt 
the algorithm prematurely. The following algorithm now uses the described recursive subprocedure in 
order to find a satisfying assignment for the whole formula. 



function solve_lll(F)
    α ← an assignment picked uniformly at random from {0, 1}^vbl(F);
    while vlt(F, α) ≠ ∅ do
        D ← lexicographically first clause in vlt(F, α);
        α ← locally_correct(F, α, D);
        keep track of the number of recursive invocations done by locally_correct;
        if the number exceeds log m + 2, then abort the whole loop and
        restart, sampling another α.
    return α;



Algorithm 2.2: the complete solver 
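
Algorithms 2.1 and 2.2 can be sketched together in Python as follows. This is an illustrative sketch under stated assumptions, not the paper's implementation: literals are signed integers, the list order of `formula` plays the role of the fixed lexicographic ordering, the assignment is updated in place instead of being returned, the logarithm is taken to base 2, and the `Abort` exception is a hypothetical device for the restart.

```python
import math
import random
from typing import Dict, FrozenSet, List, Optional

Clause = FrozenSet[int]  # assumed encoding: +v is the variable v, -v its complement


class Abort(Exception):
    """Raised when one correction exceeds its budget of recursive invocations."""


def vbl(c: Clause) -> set:
    return {abs(lit) for lit in c}


def is_violated(c: Clause, alpha: Dict[int, int]) -> bool:
    return all(alpha[abs(lit)] != (1 if lit > 0 else 0) for lit in c)


def solve_lll(formula: List[Clause], rng: Optional[random.Random] = None) -> Dict[int, int]:
    """Assumes a non-empty formula of non-empty clauses."""
    rng = rng or random.Random()
    variables = sorted(set().union(*(vbl(c) for c in formula)))
    budget = math.log2(len(formula)) + 2
    # Inclusive neighbourhoods, each listed in the fixed clause ordering.
    nbrs = {c: [d for d in formula if vbl(c) & vbl(d)] for c in formula}

    def locally_correct(alpha: Dict[int, int], c: Clause, counter: List[int]) -> None:
        counter[0] += 1
        if counter[0] > budget:
            raise Abort
        for v in vbl(c):                      # resample the variables of C
            alpha[v] = rng.randint(0, 1)
        while True:
            bad = [d for d in nbrs[c] if is_violated(d, alpha)]
            if not bad:
                return
            locally_correct(alpha, bad[0], counter)

    while True:                               # one iteration per (re)start
        alpha = {v: rng.randint(0, 1) for v in variables}
        try:
            while True:
                bad = [c for c in formula if is_violated(c, alpha)]
                if not bad:
                    return alpha
                locally_correct(alpha, bad[0], [0])  # fresh counter per correction
        except Abort:
            continue
```

A call such as `solve_lll([frozenset({1, 2, 3}), frozenset({4, -5, 6})])` returns a dictionary mapping each variable to 0 or 1. The polynomial expected running time claimed in the text is only guaranteed for formulas satisfying the neighbourhood condition of Theorem 1.2.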



If the algorithm terminates, the result clearly constitutes a satisfying assignment. We however have 
to check that the expected running time is polynomial. The remainder of the proof is to establish this 
property. 

In order to be able to talk about the behaviour of the algorithm we need to control the randomness
injected. Let us formalize the random bits used in a way that will greatly simplify the analysis. Let us say
that a total function $A : \mathrm{vbl}(F) \times \mathbb{N}_0 \to \{0, 1\}$ is a table of assignments for $F$. Let us extend the notion to
literals in the natural manner, i.e. $A(\bar{x}, i) := 1 - A(x, i)$. Furthermore call a total function $\alpha : \mathrm{vbl}(F) \to \mathbb{N}_0$
an indirect assignment. Given a fixed table of assignments, an indirect assignment automatically induces
a standard truth assignment ${}^*\alpha$ which is defined as ${}^*\alpha(x) := A(x, \alpha(x))$ for all $x \in \mathrm{vbl}(F)$. Let us now,
just for the analysis, imagine that the algorithm works with indirect assignments instead of standard ones.
That is, instead of sampling a starting assignment uniformly at random, solve_lll(...) could sample a table
of assignments $A$ uniformly at random by randomly selecting each of its entries. It will then hand over the
pair $(A, \alpha)$ to locally_correct(...), where $\alpha$ is this time an indirect assignment which is zero everywhere.
Note that this is equivalent since ${}^*\alpha$ is uniformly distributed. Each time the value of a variable is supposed
to be resampled inside locally_correct(...), we instead increase its indirect value by one. Note that this
again corresponds exactly to sampling a new random value since the corresponding entry of the table
has never been used before. In the sequel, we will adopt this view of the algorithm and its acquisition of
randomness.
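
This view of the randomness can be sketched as follows. The class names are hypothetical, and the table is sampled lazily on first access rather than in advance, which is an implementation shortcut assumed here; it yields the same distribution because every entry is still an independent uniform bit that never changes once drawn.

```python
import random
from collections import defaultdict
from typing import Dict, Tuple


class AssignmentTable:
    """Lazily sampled table A : vbl(F) x N_0 -> {0, 1} (illustrative sketch)."""

    def __init__(self, seed: int = 0) -> None:
        self._rng = random.Random(seed)
        self._entries: Dict[Tuple[int, int], int] = {}

    def entry(self, var: int, i: int) -> int:
        # Each entry is drawn once, on first access, and stays fixed afterwards.
        if (var, i) not in self._entries:
            self._entries[(var, i)] = self._rng.randint(0, 1)
        return self._entries[(var, i)]


class IndirectAssignment:
    """Indirect assignment alpha : vbl(F) -> N_0, initially zero everywhere."""

    def __init__(self, table: AssignmentTable) -> None:
        self.table = table
        self.index: Dict[int, int] = defaultdict(int)

    def value(self, var: int) -> int:
        # Induced truth value *alpha(x) := A(x, alpha(x)).
        return self.table.entry(var, self.index[var])

    def resample(self, var: int) -> None:
        # "Resampling" x just moves on to the next, so far unused, table entry.
        self.index[var] += 1
```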

We need to be able to record an accurate journal of what the algorithm does. A recursion tree $\tau$
is an (unordered) rooted tree together with a labelling $\sigma_\tau : V(\tau) \to F$ of each vertex with a clause of
the formula such that if, for $u, v \in V(\tau)$, $u$ is the parent node of $v$, then $\sigma_\tau(v) \in \Gamma^+_F(\sigma_\tau(u))$. We can
record the actions of the recursive procedure in terms of such a recursion tree, where the root is labelled
with the clause the procedure was originally called for and all descendant nodes represent the recursive
invocations and carry as labels the clauses handed over in those. Let now a table of assignments be
globally fixed and let $\alpha$ be any indirect assignment and $C \in F$ some clause violated under ${}^*\alpha$. Suppose
that locally_correct(...) halts on inputs $\alpha$ and $C$. Then we say that the complete recursion tree on the given
input is the complete representation of the recursive process up to the point where it returns. Even if the
process does not return or does not return in the time we allot, we can capture the tree representing the
recursive invocations made in any intermediate step and we call these intermediate recursion trees.

Let $\tau$ be any recursion tree. The size $|\tau|$ of $\tau$ is defined to be the number of vertices. In order to
simplify notation, we will write $[v] := \sigma_\tau(v)$ for any $v \in V(\tau)$ to denote the label of vertex $v$. Let us say
that a variable $x \in \mathrm{vbl}(F)$ occurs in $\tau$ if there exists a vertex $v \in V(\tau)$ such that $x \in \mathrm{vbl}([v])$. We write
$\mathrm{vbl}(\tau)$ to denote the set of variables that occur in $\tau$.

We are now ready to make a first statement about the correctness of the algorithm. 

Lemma 2.1. Let $F$ and $A$ be globally fixed. Let $\alpha$ be any indirect assignment and $C \in F$ a clause violated
under ${}^*\alpha$. Suppose locally_correct(...) halts on input $\alpha$ and $C$ and let $\tau$ be the complete recursion tree for
this invocation. Then the assignment $\alpha'$ which the function outputs satisfies the subformula $F_{(\mathrm{vbl}(\tau))}$.

Proof. Assume that $\alpha'$ violates some clause $D \in F_{(\mathrm{vbl}(\tau))}$. Suppose furthermore that during the process we
have recorded the ordering in which the recursive invocations were made. Now let $v \in V(\tau)$ be the last
vertex (according to that ordering) of which the corresponding label $[v]$ shares common variables with $D$.
Since any invocation on input $[v]$ can only return once all clauses in $\Gamma^+_F([v])$ are satisfied, it cannot return
before $D \in \Gamma^+_F([v])$ is. Since after that no changes of the assignments for the variables in $D$ have occurred
(by choice of $v$), $D$ is still satisfied when the function returns, a contradiction. □

In the proof we had to assume having remembered the ordering in which the process generated the
vertices. However, from the statement of the lemma we can now infer that this is not necessary since that
ordering can be reconstructed by just looking at the shape and the labels. For any recursion tree $\tau$, let
us define the natural ordering $\pi_\tau : V(\tau) \to [|V(\tau)|]$ to be the ordering we obtain by starting a depth-first
search at the root and at every node, selecting among the not-yet traversed children the one with the
lexicographically first label (we use the notation $[n] := \{1, 2, \ldots, n\}$ for $n \in \mathbb{N}$). We claim the following.
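
The natural ordering can be computed as in the following sketch. The representation is assumed for illustration: vertices are integers, `children` maps a vertex to its list of children, and `label_rank` maps a vertex to the position of its clause label in the fixed lexicographic ordering. The sketch relies on no two siblings carrying the same label, which is part of the claim below.

```python
from typing import Dict, List


def natural_ordering(children: Dict[int, List[int]],
                     label_rank: Dict[int, int],
                     root: int) -> Dict[int, int]:
    """Return pi: vertex -> position in 1..|V|, visiting depth-first and
    taking the children of each node in increasing order of label rank."""
    pi: Dict[int, int] = {}
    stack = [root]
    while stack:
        v = stack.pop()
        pi[v] = len(pi) + 1
        kids = sorted(children.get(v, []), key=lambda c: label_rank[c])
        # Push in reverse so that the lexicographically first child is popped next.
        stack.extend(reversed(kids))
    return pi
```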

Lemma 2.2. Let $\tau$ be any (intermediate) recursion tree produced by any (possibly non-terminating) call to
locally_correct(...). The ordering in which the process made recursive invocations coincides with the natural
ordering of $\tau$ and in particular, there is no node with two identically labelled children.

Proof. Note that a recursive procedure naturally generates a recursion tree in depth-first search order.
What we have to check is merely that the children of every node are generated according to the lexicographic
ordering of their labels. We proceed by induction. The recursion tree $\tau$ has a root $r$ labelled $[r]$ and a
series of children. Assume as induction hypothesis that the subtrees rooted at the children have the required
property. We have to prove that on the highest invocation level where the loop checks for unsatisfied clauses
in $\Gamma^+_F([r])$, no clause is either picked twice or is picked despite the fact that it precedes a clause already
picked during the same loop in the lexicographic ordering. Both possibilities are excluded by Lemma 2.1,
which readily implies that after a recursive invocation returns, the set of violated clauses is a strict subset
of the set of clauses violated before and in particular the clause we made the call for is satisfied ever after.
It can easily be verified that allowing intermediate trees does no harm. □

Lemma 2.1 also implies that the maximum number of times the outer loop (in solve_lll(...)) needs to
be repeated is bounded by $O(m)$ since the total number of violated clauses cannot be any larger and each
call to the correction procedure eliminates at least one. Since we abort the recursion whenever it takes
more than logarithmically many invocations, clearly the whole loop terminates after a polynomial number
of steps. The only thing left to prove is that it is not necessary to abort and jump back to the beginning
more often than a polynomial number of times. In the sequel, we will show that in expectation, this
has to be done at most twice.

3 Consistency and composite witnesses 

Suppose that we fix an assignment table $A$ and an indirect assignment $\alpha$ and we call locally_correct(...)
for some violated clause $C$ and wait. Suppose that the function does not return within the $\log(m) + 2$
steps allotted and we abort. Now let $\tau$ be the recursion tree that has been produced up to the time of
interruption; $\tau$ has at least $\log(m) + 2$ vertices (up to some rounding issues it has exactly that many).
Suppose that somebody were to be convinced that it was really necessary for the process to take that long,
then we can present them the tree $\tau$ as a justification. By inspecting $\tau$ together with $\alpha$, they can verify that
for the given table $A$, the correction could not have been completed any faster. Such a certification concept
will now allow us to estimate the probability of abortion. We have to introduce some formal notions.

Let $v \in V(\tau)$ be any vertex and $x \in \mathrm{vbl}([v])$ a variable that occurs there. We define the occurrence
index of $x$ in $v$ to be the number of times that $x$ has occurred before in the tree, written

$\mathrm{idx}_\tau(x, v) := |\{v' \in V(\tau) \mid \pi_\tau(v') < \pi_\tau(v),\ x \in \mathrm{vbl}([v'])\}|$.

If $\delta$ is any indirect assignment, we say that $\tau$ is consistent with $A$ offset by $\delta$ if the property

$\forall v \in V(\tau) : \forall L \in [v] : A(L, \mathrm{idx}_\tau(\mathrm{vbl}(L), v) + \delta(\mathrm{vbl}(L))) = 0$

holds. Furthermore, let us define the offset assignment induced by $\tau$ as

$\delta_\tau(x) := |\{v \in V(\tau) \mid x \in \mathrm{vbl}([v])\}|$

for all $x \in \mathrm{vbl}(F)$. Recall now what the recursion procedure does; it starts with a given indirect assignment
and performs recursive invocations in the natural ordering of the produced recursion tree and in each
invocation, the indirect assignments of the variables in the corresponding clause are incremented. This
immediately yields the following observation.

Observation 3.1. If $\tau$ is any (intermediate) recursion tree of an invocation of the recursive procedure for
a table $A$ and an indirect starting assignment $\alpha$, then $\tau$ is consistent with $A$ offset by $\alpha$.
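
The occurrence index and the consistency property can be checked mechanically, as in the sketch below. It assumes (for illustration only) that the tree is handed over as the list of its clause labels in natural ordering, that literals are signed integers, and that the table is a function `table(x, i)` returning 0 or 1; all names are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, FrozenSet, List, Tuple

Clause = FrozenSet[int]  # assumed encoding: +v is the variable v, -v its complement


def lookups(labels_in_order: List[Clause],
            offset: Dict[int, int]) -> List[Tuple[int, int, int]]:
    """For every (vertex, literal) pair list (literal, variable, table index),
    where the index is the occurrence index plus the offset of the variable."""
    seen: Dict[int, int] = defaultdict(int)   # occurrences so far = occurrence index
    cells = []
    for clause in labels_in_order:
        for lit in sorted(clause):
            x = abs(lit)
            cells.append((lit, x, seen[x] + offset.get(x, 0)))
        for x in {abs(lit) for lit in clause}:
            seen[x] += 1
    return cells


def is_consistent(labels_in_order: List[Clause],
                  offset: Dict[int, int],
                  table: Callable[[int, int], int]) -> bool:
    """True if every literal evaluates to 0 at its prescribed table entry."""
    for lit, x, i in lookups(labels_in_order, offset):
        value = table(x, i) if lit > 0 else 1 - table(x, i)
        if value != 0:
            return False
    return True
```

Each clause in the tree was violated at the moment it was picked, which is why every one of its literals has to evaluate to 0 at the entry that was current at that time.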

Let a collection $W := \{\tau_1, \tau_2, \ldots, \tau_t\}$ of recursion trees be given of which the roots have pairwise distinct
labels. Consider an auxiliary graph $\mathcal{H}_W$ of which those trees are the vertices and two distinct vertices $\tau_i, \tau_j$
are connected by an edge if the corresponding two trees share a common variable, i.e. $\mathrm{vbl}(\tau_i) \cap \mathrm{vbl}(\tau_j) \ne \emptyset$.
If $\mathcal{H}_W$ is connected, then $W$ is said to be a composite witness for $F$. The vertex set of a composite witness
is defined to be $V(W) := \{(\tau, v) \mid \tau \in W,\ v \in V(\tau)\}$, i.e. it is the set of all vertices in any of the separate
trees, each annotated by its tree of origin. To shorten notation, we access the label of such a vertex by
writing $[w] := [v]$ for each $w = (\tau, v) \in V(W)$. The size of a composite witness is defined to be $|V(W)|$.

There is a natural way of traversing the vertices in $V(W)$. Consider that each of the recursion trees
$\tau_j$ has a distinctly labelled root. Now order the trees according to the lexicographic ordering of the root
labels. Traverse the first tree according to its natural ordering, then the second one, and so forth. We call
this the natural ordering for a composite witness and we write $\pi_W : V(W) \to [|V(W)|]$ to denote it. Let
$v \in V(W)$ be any vertex of the composite witness and $x \in \mathrm{vbl}([v])$ any variable that occurs there. We
define the occurrence index of $x$ in $v$ to be the number of times that $x$ has occurred before in the witness,
written

$\mathrm{idx}_W(x, v) := |\{v' \in V(W) \mid \pi_W(v') < \pi_W(v),\ x \in \mathrm{vbl}([v'])\}|$.

We say that a composite witness is consistent with $A$ if the property

$\forall v \in V(W) : \forall L \in [v] : A(L, \mathrm{idx}_W(\mathrm{vbl}(L), v)) = 0$

holds. The definitions immediately imply the following.

Observation 3.2. If $W = \{\tau_1, \tau_2, \ldots, \tau_t\}$ is a composite witness with the recursion trees ordered according
to the lexicographical ordering of their root labels, then $W$ is consistent with $A$ if and only if for all $1 \le i \le t$,
$\tau_i$ is consistent with $A$ offset by $\sum_{j<i} \delta_{\tau_j}$.

Moreover, note that for vertices $v \in V(W)$ and literals $L \in [v]$, the mapping

$(v, L) \mapsto (\mathrm{vbl}(L), \mathrm{idx}_W(\mathrm{vbl}(L), v))$

is, by definition of $\mathrm{idx}_W$, an injection. This implies that if we are given a fixed composite witness $W$ and
we want to check whether $W$ is consistent with a table $A$, then for each pair $(v, L)$ to be checked, we
have to look up one distinct entry in the table. In total, we have to look up $k|V(W)|$ entries and $W$ can
be consistent with $A$ only if each of those entries evaluates exactly as prescribed. This yields the
following.

Observation 3.3. If $A$ is a random table of assignments where each entry has been selected uniformly at
random from $\{0, 1\}$, then the probability that a given fixed composite witness $W$ becomes consistent with $A$
is exactly $2^{-k|V(W)|}$.
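
As a small worked instance of this count (the values $k = 3$ and $|V(W)| = 2$ are chosen here purely for illustration and do not appear in the paper): such a witness prescribes the values of $3 \cdot 2 = 6$ distinct table entries, and since the entries are independent uniform bits the probabilities multiply.

```latex
\Pr[\,W \text{ consistent with } A\,]
  = \prod_{(v, L)} \Pr\bigl[A(L, \mathrm{idx}_W(\mathrm{vbl}(L), v)) = 0\bigr]
  = \left(\tfrac{1}{2}\right)^{k\,|V(W)|}
  = 2^{-3 \cdot 2} = \tfrac{1}{64}.
```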

In fact, we will now see that whenever the superintending procedure has to interrupt the recursion 
because it has made too many invocations, then the bad luck at the origin of this behaviour can be certified 
by means of a large composite witness which occurs very rarely. We say that a composite witness is large 
if it has size at least $\log m + 2$. We claim the following.

Lemma 3.4. Let an assignment table $A$ be fixed and now run the loop in solve_lll(...), i.e. starting
with the indirect all-zero assignment, repeatedly call the correction procedure on the lexicographically first
clause violated by the current assignment and replace the assignment by the returned one. If the procedure
fails, that is if any of the local correction steps has to be interrupted because it needed at least $\log m + 2$
invocations, then there exists a large composite witness for $F$ that is consistent with $A$.

Proof. Let $\tau_1, \tau_2, \ldots, \tau_t$ be the sequence of recursion trees that certify the recursive processes we started.
Note that since the last step had to be interrupted, $\tau_t$ is intermediate and of size at least $\log m + 2$. The
other trees are complete recursion trees. Moreover, note that by Lemma 2.1 no completed recursion
process can have introduced any new violated clauses and therefore, the root labels of the trees are
ordered lexicographically.

Let now $\mathcal{H}$ be an auxiliary graph of which $\{\tau_1, \tau_2, \ldots, \tau_t\}$ is the vertex set and two trees are connected
by an edge if there is a variable that occurs in both of them. Let us identify the connected components of
$\mathcal{H}$ and let $W \subseteq \{\tau_1, \tau_2, \ldots, \tau_t\}$ be the connected component of which $\tau_t$ is part. Note that all trees $\tau'$ that
are not in $W$ do not have any variables in common with any tree in $W$, therefore all their induced offset
assignments $\delta_{\tau'}$ are zero on all of $\mathrm{vbl}(W)$. By Observation 3.1 it follows that if $W = \{\tau_{i_1}, \tau_{i_2}, \ldots, \tau_{i_r} = \tau_t\}$,
where the indices preserve the ordering, then for all $1 \le j \le r$, the tree $\tau_{i_j}$ is consistent with $A$ offset by
$\sum_{l<j} \delta_{\tau_{i_l}}$. By Observation 3.2 this yields the claim. □

Since we already know that a fixed composite witness is unlikely to become consistent with a random 
table, it just remains to count the number of composite witnesses that can possibly exist for F. This 
will yield a bound on the probability that there is no large composite witness consistent with the table 
at all and thence also on the probability that each local correction will go through without the need for 
premature interruption. 



4 Encoding and counting composite witnesses 

In order to give an upper bound on the number of composite witnesses for a given formula, we will think 
about how they can be encoded efficiently. The most obvious way of just encoding each recursion tree 
separately by giving the shape of the tree and its labels does unfortunately not suffice. There are two 
properties of witnesses that we can exploit. Firstly, the labellings of the separate trees are not arbitrary 
but follow the rule that the label of a child is always either a neighbour in G[F] to the label of the parent 
or identical to that label. Moreover, the fact that the auxiliary graph that describes the interconnectivity 
of the recursion trees inside the composite witness has to be connected yields that those trees have to lie 
in proximity of one another such that only one root vertex for the whole composite witness will have to be 
stored. 

More formally, let $\mathcal{J}$ be the infinite rooted $(2d)$-ary tree. Note that every node in that tree has
at least twice as many children as there can be clauses in $\Gamma^+_F(C)$ for any $C \in F$. Let now $R \in F$ be
any clause. We will 'root', so to speak, $\mathcal{J}$ at $R$ and starting from there, 'embed' the nodes of $\mathcal{J}$ into
$G[F]$ in the following fashion. Let $\sigma_R : \mathcal{J} \to G[F]$ be a labelling of the vertices of $\mathcal{J}$ such that the
root $r$ is labelled $\sigma_R(r) = R$ and whenever a node $v \in V(\mathcal{J})$ is labelled by $\sigma_R(v) = D \in F$, then the
$2d$ children of $v$, call them $c_1, c_2, \ldots, c_{2d}$, are labelled as follows: for $1 \le i \le |\Gamma^+_F(D)| \le d$, we label
$\sigma_R(c_i) :=$ (the lexicographically $i$-th clause of $\Gamma^+_F(D)$). For $d + 1 \le i \le d + |\Gamma^+_F(D)| \le 2d$, we label
$\sigma_R(c_i) := \sigma_R(c_{i-d})$ in the same ordering. If there are remaining children then these are not labelled, that is
the map is partial. We will call the first $d$ children of every node the low children and the remaining $d$
children the high children of that node.
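
The labelling rule for the children of a node can be sketched as a small function. The signature is hypothetical: `gamma_plus` stands for the inclusive neighbourhood of the parent's label, already sorted in the fixed lexicographic ordering, and clauses are represented by arbitrary names.

```python
from typing import List, Optional


def child_label(gamma_plus: List[str], d: int, i: int) -> Optional[str]:
    """Label of the i-th child (1-based) of a node whose label D has the sorted
    inclusive neighbourhood gamma_plus; None where the partial map is undefined."""
    if not 1 <= i <= 2 * d:
        raise ValueError("a node has exactly 2d children")
    # Low children are 1..d; high child i carries the label of low child i - d.
    j = i if i <= d else i - d
    return gamma_plus[j - 1] if j <= len(gamma_plus) else None
```

With `d = 4` and `gamma_plus = ["C1", "C4", "C7"]`, children 1 to 3 and 5 to 7 are labelled C1, C4, C7 in this order, while children 4 and 8 stay unlabelled.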

Let us now say that a triple $(C, T, c)$, where $C \in F$ is a clause, $T$ is a subtree of $\mathcal{J}$ containing the
root and $c$ is a 2-colouring (or 2-partition) of the edges of $T$, is a witness encoding. The size of a witness
encoding is defined to be the number of vertices in the tree, $|V(T)|$. We will see that we can reversibly
encode each composite witness as such a triple and this will yield a bound that is strong enough for our
purposes.

Lemma 4.1. Let $u \in \mathbb{N}$. There is an injection from the set of composite witnesses of size exactly $u$ for $F$
into the set of witness encodings of size $u$. The total number of distinct composite witnesses of size exactly
$u$ that exist for $F$ is upper bounded by $m \cdot 2^{u(k-1)}$.

Proof. Let $W$ be any composite witness for $F$. To prove the claim, we have to transform $W$ into a triple
$(C, T, c)$ as described above in a reversible fashion.

To begin with, we will show that any recursion tree $\tau$ can be encoded as a subtree of size $|\tau|$ of $\mathcal{J}$ which
is rooted at any vertex $v$ of $\mathcal{J}$ which carries a label that also occurs in $\tau$, i.e. $\sigma_R(v) \in \mathrm{Im}(\sigma_\tau)$. The encoding
is reversible for any choice of $R$ and any choice of such vertex $v$. We proceed as follows. Let $R$ and $v$ be
chosen arbitrarily and let then $u \in V(\tau)$ be any vertex such that $\sigma_R(v) = [u]$. Let $u = u_0, u_1, \ldots, u_j$ be
the succession of vertices that we encounter on the shortest path from $u$ to the root, where $u_j$ is the root
of $\tau$. We now start producing a subtree of $\mathcal{J}$ that is rooted at $v$. We add the unique high child $c_1$ of $v$ with
$\sigma_R(c_1) = [u_1]$. This child exists because obviously $[u_1]$ is a neighbour of $[u_0]$ in $G[F]$. To $c_1$ as the parent,
we then add the unique high child $c_2$ such that $\sigma_R(c_2) = [u_2]$, and so forth. We have built a path from $v$
to a vertex $c_j$ that is labelled $[u_j]$, just like the root of $\tau$. Now, in the most natural fashion, attach to $v$ all
subtrees of $u$ in such a way that each descendant is labelled the same way in $\mathcal{J}$ as the corresponding vertex
is labelled in $\tau$. Always use the uniquely defined low children for this kind of embedding. Attach then to
$c_1$ all the subtrees of $u_1$ except for the one rooted at $u_0$ that we have already handled, using the same type
of embedding. Do the same for $c_2, \ldots, c_j$. The result is a subtree of $\mathcal{J}$ of size $|\tau|$ which is rooted at $v$ with
a one-to-one correspondence of its vertices to the vertices of $\tau$ and such that every vertex is labelled under
$\sigma_R$ in the same way as the corresponding vertex is labelled in $\tau$. Note that this transformation is reversible.
Whenever we are given such a subtree, we can start at its root vertex and follow the unique high child
of every node until this is no longer possible. The vertex we end up at corresponds to the original root of
$\tau$. It is clear that from there, we can perform the same succession of 'rotations' backwards to completely
reconstruct $\tau$.

Consider the composite witness now. Let $W = \{\tau_1, \tau_2, \ldots, \tau_t\}$. Let $\mathcal{H}_W$ be, as usual, the auxiliary
graph where those recursion trees are the vertices and an edge exists between any two trees if a common
variable occurs in both of them. By the definition of composite witnesses, $\mathcal{H}_W$ is connected. This means
that we can easily exhibit an ordering of the vertices of $\mathcal{H}_W$ such that each vertex in that ordering is
connected to at least one of the vertices listed previously. W.l.o.g., let the ordering $\tau_1, \tau_2, \ldots, \tau_t$ have this
property. It need of course not coincide with the lexicographic ordering of the root labels.

Define both $C$ and $R$ to be the label of the root of $\tau_1$. This is going to be the root vertex of the whole
construction. The labelling of $\mathcal{J}$ will now be $\sigma_R$. We encode the recursion trees as subtrees of $\mathcal{J}$ in the
ordering just devised and we glue them together to form one large tree. Start with $\tau_1$ and encode it in the
way described above as a subtree of $\mathcal{J}$ which is rooted at the root of $\mathcal{J}$. This is possible because we have
just labelled the root in this way. Now assume we have added all trees up to $\tau_i$ for some $1 \le i < t$, producing so far a
subtree $T_i$ of $\mathcal{J}$ and now we have to add $\tau_{i+1}$. By the way we have ordered the trees, there exists a variable
$x \in \mathrm{vbl}(\tau_{i+1}) \cap \mathrm{vbl}(\tau_j)$ for some $j < i + 1$. Since the encoding as $T_i$ has preserved all the labels, there
is also at least one vertex in $T_i$ of which the label contains $x$. Let $g$ be a deepest such vertex in $T_i$, i.e.
$x \in \mathrm{vbl}(\sigma_R(g))$ but no descendant of $g$ in $T_i$ has a label which contains $x$. Let $r_{i+1} \in V(\tau_{i+1})$ be any vertex
in $\tau_{i+1}$ of which the label contains $x$. Clearly, $[r_{i+1}]$ and $\sigma_R(g)$ are either identical or neighbours in $G[F]$, therefore
there exists a unique low child $c_g$ of $g$ (in $\mathcal{J}$) which is labelled $\sigma_R(c_g) = [r_{i+1}]$. Note that by choice of $g$, $c_g$
is not present in $T_i$. Use the procedure described before to produce a representation of $\tau_{i+1}$ as a subtree of
$\mathcal{J}$ which is rooted at $c_g$. Add this representation to $T_i$ and connect the root $c_g$ to the parent $g$ by an edge.
Mark this edge as a special glueing edge by means of the colouring $c$. All other edges are marked regular
by that colouring. Continue the process until all trees are added. Note that the tree $T := T_t$ produced
contains exactly $|V(W)|$ vertices.

It can easily be seen that the construction is reversible. If any such subtree, the root label $C = R$
and the colouring are given, simply reconstruct all the labels of $\mathcal{J}$ as $\sigma_R$, then delete all glueing edges to
produce a disjoint subforest of $\mathcal{J}$. Each of the subtrees can be retransformed into the original recursion
tree with the correct root by the reversal process described above.

Now that we have found such an encoding, determining the numbers is no longer difficult. There are
$m$ choices to select $C \in F$. According to a simple counting exercise by Donald Knuth in [Knu69], the number
of rooted subtrees of size exactly $u$ in $\mathcal{J}$ does not exceed $(2ed)^u$. There are fewer than $2^u$ distinct 2-colourings of
the edges of such a tree. Therefore, the total number is upper bounded by $m(4ed)^u < m(16d)^u = m(2^{k-1})^u$,
as claimed. □
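
Spelled out step by step, the final estimate combines the three factors above; the only additional facts used are $4e \approx 10.87 < 16$ and the choice $d = 2^{k-5}$ from Section 2.

```latex
\begin{align*}
\#\{\text{witness encodings of size } u\}
  &\le m \cdot (2ed)^u \cdot 2^u = m\,(4ed)^u \\
  &< m\,(16d)^u && \text{since } 4e \approx 10.87 < 16 \\
  &= m\,\bigl(2^{4} \cdot 2^{k-5}\bigr)^u = m \cdot 2^{u(k-1)} && \text{since } d = 2^{k-5}.
\end{align*}
```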

Since the composite witnesses are little in number, we can now infer that their occurrence is sufficiently 
unlikely. 

Lemma 4.2. Consider an assignment table A produced by selecting each entry from {0, 1} uniformly and 
independently at random. The probability that there exists in F a large composite witness which is consistent 
with A is at most 1/2. 

Proof. Let $X_u$ be the random variable that counts the number of composite witnesses of size exactly u
which are consistent with A. We have already derived that the probability that a fixed composite witness
of size u becomes consistent with A is $2^{-ku}$. Combining this with the previous lemma yields the following
bound on the expected value of $X_u$.

$$E(X_u) \le 2^{-ku} \cdot m \cdot 2^{u(k-1)} = m 2^{-u}.$$

If X denotes the random variable that counts the total number of large composite witnesses which are
consistent with A, then for the expected value of X we obtain

$$E(X) = \sum_{u \ge \log(m)+2} E(X_u) \le m \cdot 2^{-\log m} \cdot \sum_{u \ge 2} 2^{-u} = \frac{1}{2}.$$

If the number of such witnesses is at most 1/2 on average, then for at least half of the assignment tables, 
there is no consistent such witness at all. □ 
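The geometric sum in this proof can be verified with exact arithmetic. The sketch below assumes the logarithm is to base 2 and that sizes u range over integers; the helper name and the truncation parameter `terms` are illustrative choices.

```python
from fractions import Fraction


def expected_large_witnesses(m: int, terms: int = 200) -> Fraction:
    """Partial sum of m * 2^(-u) over integers u >= log2(m) + 2.

    (m - 1).bit_length() equals ceil(log2(m)) for m >= 1, which avoids
    floating-point rounding when locating the first admissible size u.
    """
    start = (m - 1).bit_length() + 2
    return sum(Fraction(m, 2 ** u) for u in range(start, start + terms))


if __name__ == "__main__":
    # Every partial sum stays at or below 1/2, matching the bound on E(X).
    print(all(expected_large_witnesses(m) <= Fraction(1, 2) for m in range(1, 200)))
```

When m is a power of two the infinite sum equals exactly 1/2; otherwise rounding the starting size up to the next integer makes it strictly smaller.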

This concludes the proof of Theorem 1.2.



5 A deterministic variant 

In this section we demonstrate that there is nothing inherently randomized about our procedure. If we assume
k to be a constant, we can give a deterministic polynomial-time algorithm for the problem. The key idea will
be that we enumerate all possible large composite witnesses and then, instead of sampling an assignment
table at random and hoping that it will avoid all of them, we deterministically search for a table that does.
Since we can only enumerate a polynomial number of composite witnesses, we have to check more carefully
which of them are really relevant. The following lemma says that it is not necessary to check arbitrarily
large witnesses.

Lemma 5.1. Let $u \in \mathbb{N}$ and let A be a fixed table of assignments. If there exists a composite witness of
size at least u for F that is consistent with A, then there also exists a composite witness of a size in the
range $[u, ku + 1]$ which is equally consistent with A.

Proof. Assume that the claim is false for a fixed value $u \in \mathbb{N}$. Then assume that W is a composite
witness that constitutes a smallest counterexample, i.e. W is a composite witness of size at least u which
is consistent with A, but there exists no witness of a size in the range $[u, ku + 1]$ which is consistent with
A, and W is smallest with this property. Clearly, this implies that W has size larger than $ku + 1$.

Let $W = \{\tau_1, \tau_2, \ldots, \tau_t\}$ be the list of recursion trees contained in W, sorted according to the lexicographic
ordering of the root vertices. Now let us remove the very last vertex according to the natural
ordering, i.e. the vertex in position $|V(W)|$ of that ordering. If $\tau_t$ consists of a singleton vertex, this amounts to deleting $\tau_t$ from
the witness; in all other cases it amounts to removing the last vertex in the natural ordering from $\tau_t$. Let
in both cases $W^*$ be the modified collection of recursion trees. Now consider the auxiliary graph $\mathcal{H}_{W^*}$ with
$W^*$ being the vertex set and edges in case of common variables. Due to the deletion of the last vertex
and therefore possibly of a clause from the set of labels of $W^*$, the graph $\mathcal{H}_{W^*}$ might have fallen apart
into several connected components. However, considering that in the removed clause there were exactly k
variables and the variable sets of the connected components are disjoint, the number of such components
cannot exceed k. If we now select the largest component $W'$ of $\mathcal{H}_{W^*}$, where large in this case means having
the maximum number of vertices summed over the trees in that component, then $W'$ is by definition again
a composite witness and by choice of W, $|V(W')| \ge (|V(W)| - 1)/k \ge u$. Note also that $W'$ is consistent
with A because neither the removal of the last vertex in the natural ordering, nor the removal of any
trees covering variable sets disjoint from the set of variables covered by $W'$ can influence the consistency
property.

In total, we have found a composite witness $W'$ which is strictly smaller than W and also consistent
with A. Either the size of $W'$ is in the range $[u, ku+1]$, a contradiction, or it is a smaller counterexample,
a contradiction as well. □
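The component-selection step of this proof can be illustrated in a simplified model. The sketch below assumes each recursion tree is reduced to a pair (number of vertices, set of covered variables); this representation and the function name are hypothetical and serve only to show the grouping by common variables and the choice of the largest group.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical stand-in for a recursion tree: (vertex count, covered variables).
Tree = Tuple[int, FrozenSet[str]]


def largest_component(trees: List[Tree]) -> List[Tree]:
    """Group trees that share variables; return the group with most vertices.

    Assumes `trees` is non-empty.
    """
    parent = list(range(len(trees)))

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    owner: Dict[str, int] = {}  # variable -> index of first tree covering it
    for i, (_, variables) in enumerate(trees):
        for x in variables:
            if x in owner:
                parent[find(i)] = find(owner[x])  # common variable: same component
            else:
                owner[x] = i

    groups: Dict[int, List[Tree]] = {}
    for i, tree in enumerate(trees):
        groups.setdefault(find(i), []).append(tree)
    # "Largest" means maximum number of vertices summed over the trees.
    return max(groups.values(), key=lambda g: sum(size for size, _ in g))


if __name__ == "__main__":
    remaining = [
        (3, frozenset("ab")),
        (2, frozenset("bc")),
        (4, frozenset("d")),
        (2, frozenset("de")),
    ]
    best = largest_component(remaining)
    print(sum(size for size, _ in best))
```

If the remaining trees carry n vertices in total and split into at most k components, the selected component carries at least n/k of them, which is the pigeonhole step used in the proof.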

The lemma implies that if we want to check whether a given assignment table A has the potential
to provoke a premature interruption of some correction step, then it suffices to check that no composite
witness with a size in the range $[\log m + 2, k(\log m + 2) + 1]$ exists that is consistent with A. The number
of composite witnesses with a size in that range is, by Lemma 4.1, bounded by

$$\sum_{j=\log m+2}^{k(\log m+2)+1} m \cdot 2^{j(k-1)} < ((k-1)(\log m + 2) + 2) \cdot m \cdot 2^{(k(\log m+2)+1)(k-1)},$$

which is in turn bounded by a polynomial in m. This polynomially large set of critical composite witnesses
can obviously be enumerated by a polynomial time algorithm.

Assume now that we do not simply want to check tables for consistency with certain witnesses but we
would like to directly produce an assignment table with which no large composite witness is consistent.
Note that during the algorithm, no variable's indirect assignment is incremented more than $m \cdot (\log m + 2)$
times and therefore at most this number of rows in the assignment table is used. Let us then define boolean
variables $z_{i,x}$ for all $i \in [0, m \cdot (\log m + 2)]$ and all $x \in \mathrm{vbl}(F)$ to represent the entry $A(x, i)$ of the table. If
W is any fixed composite witness, then we can in the obvious way translate the consistency property for
W into a clause $C_W$ over the variables $z_{i,x}$ such that a truth assignment $\beta$ to the variables $z_{i,x}$ violates $C_W$
if and only if the corresponding table with $A(x,i) = \beta(z_{i,x})$ for all i and x is consistent with W. Clearly,
the clause $C_W$ has size exactly $k|V(W)|$.

Let now a deterministic algorithm enumerate all composite witnesses for F with a size in the range
$[\log m + 2, k(\log m + 2) + 1]$, for each of them generate a corresponding clause, and collect all those clauses
in a CNF formula G of length polynomial in m. By the same calculation as in the proof of Lemma 4.2, the
expected number of violated clauses in G, if we sample a truth assignment $\beta$ for G uniformly at random, is
smaller than 1/2. It is well known that a formula with this property can be solved by a polynomial-time
deterministic algorithm using the method of conditional expectations (see, e.g., [Bec91]). We can therefore
obtain, deterministically and in polynomial time, an assignment $\beta$ that satisfies G. The values of $\beta$ provide
us with a corresponding assignment table A with which no composite witness with a size in the range
$[\log m + 2, k(\log m + 2) + 1]$ and, by Lemma 5.1, therefore no large composite witness at all is consistent. If
we now invoke our algorithm, replacing all random values by the fixed table A, we are guaranteed that no
interruption will occur and all corrections will go through to produce a satisfying assignment of F.

This concludes the proof of Theorem 1.3.
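The method of conditional expectations invoked in the deterministic variant can be sketched as follows. This is a generic illustration of the technique, not the paper's own procedure: the clause representation, the function names, and the assumption that no variable occurs twice within a clause are all choices made for this example.

```python
from fractions import Fraction
from typing import Dict, List, Tuple

Literal = Tuple[str, bool]  # (variable, value that satisfies the literal)
Clause = List[Literal]      # assumed to mention each variable at most once


def expected_violations(clauses: List[Clause], fixed: Dict[str, bool]) -> Fraction:
    """Expected number of violated clauses if unfixed variables are uniform."""
    total = Fraction(0)
    for clause in clauses:
        free = 0
        satisfied = False
        for var, want in clause:
            if var not in fixed:
                free += 1
            elif fixed[var] == want:
                satisfied = True
                break
        if not satisfied:
            # All remaining free literals must come out false: chance 2^(-free).
            total += Fraction(1, 2 ** free)
    return total


def derandomize(clauses: List[Clause]) -> Dict[str, bool]:
    """Fix variables one at a time without increasing the expectation."""
    variables = sorted({var for clause in clauses for var, _ in clause})
    fixed: Dict[str, bool] = {}
    for var in variables:
        # The current expectation is the average of the two candidates,
        # so the smaller candidate never exceeds it.
        fixed[var] = min(
            (False, True),
            key=lambda b: expected_violations(clauses, {**fixed, var: b}),
        )
    return fixed


if __name__ == "__main__":
    G = [
        [("a", True), ("b", True), ("c", False)],
        [("a", False), ("d", True), ("e", True)],
        [("b", False), ("c", True), ("e", False)],
    ]
    beta = derandomize(G)
    print(expected_violations(G, beta))
```

Whenever the initial expectation is below 1, as with the bound of 1/2 above, the final count of violated clauses is an integer below 1 and hence zero, so the returned assignment satisfies every clause.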